Abstract


Human operators are increasingly being called upon to function
as monitors of automatic systems. System monitors, as
opposed to active controllers, do not necessarily experience 
lower workload levels during task performance. In fact, 
prior research has suggested that workload demands may not 
be reduced but rather shifted to a functionally separate 
processing "pool" according to a structure specific view 
of human attention. Sternberg’s additive factors method may 
provide a useful workload assessment technique for localizing 
the information processing demands of task performance. The 
present study couples a primary failure detection task with 
a secondary Sternberg task which employed a perceptual and 
response load manipulation. The results demonstrated a 
significant overlap of processing resources for the failure 
detection task and the Sternberg perceptual condition. For 
the response load condition, there was no evidence of shared
resources between the two tasks. These results have significant
implications for task configuration and workload assessment
research.

INTRODUCTION


The role of the human operator has undergone dramatic revision in 
recent years with the continued encroachment of automation into the human 
domain of man/machine systems. The human operator's function has steadily 
evolved from that of a manual controller of dynamic systems to the role of a 
monitor and supervisor of automated systems. Unfortunately, this evolution, 
in many cases, has not been accompanied by a corresponding development of 
sensitive research methods to investigate the qualitatively different 
demands which are placed on human monitors. Automation will never eliminate
operator effort and its introduction into the task situation may not
necessarily serve to reduce the load experienced by human monitors. The 
purpose of this study is to further explore the information processing 
demands of monitoring automatic systems. First, however, the research 
literature on failure detection and workload assessment will be reviewed to 
provide an introduction to theoretical and methodological considerations. 

Human Monitoring and Failure Detection 

The task of monitoring system dynamics may encompass several behavioral 
objectives including failure detection. However, the actual objectives of 
human monitors will be difficult to specify in complex systems where many 
possible objectives exist (Curry, 1979). The more demanding supervisory 
function may also entail such activities as failure detection, 
identification, and corrective action on the part of the human operator 
whose responsibility it is to monitor and control large panels of 
instrumentation (Rasmussen, 1968). Detecting system changes or failures, 
whether in a supervisory or purely monitoring capacity, is, therefore, a
critical aspect of modern human operator behavior.

In the loop vs. out of the loop. During the past two decades, failure
detection researchers have considered the issue of whether humans are better 
detectors of system failures when they are in the loop exercising active 
control over the system or out of the loop as passive monitors. Proponents 
for active human control argue that increased vigilance and faster 
adaptation (e.g. taking manual control of a failed automatic system) are
logical grounds for keeping the human in the loop. On the other hand, 
out-of-the-loop advocates consider the increased capacity for processing 
information from other sources and the possibility of designing systems to 
perform functions beyond the capabilities of human operators as valid 
reasons for retaining the human operator as a passive monitor. 

The results reported in the experimental literature on this issue have 
been inconclusive in establishing any clear cut superiority of one mode over 
the other. Curry and Ephrath (1976) report that most previous investigations
into the adaptability of the human operator have concentrated on sudden and 
usually severe step changes in control element dynamics. However, even 
those experimental investigations which have utilized more subtle changes in 
system dynamics as "failures" have not resolved the issue. 

Wickens and Kessel (1979a) compared the effects of two modes of
participation on failure detection performance. In the manual (MA) mode, 
subjects were required to control the system in tracking a two-dimensional 
pursuit display. Operator input was perturbed by Gaussian noise which was 
analogous to the buffeting effects of wind gusts on an aircraft. In the 
autopilot (AU) mode, the human controller was replaced by an autopilot which 
simulated human control input, reducing the operator's role to that of a 
system monitor. Operators in both modes were required to detect failures
which were relatively small step increases in system order. The results 
indicated that the MA mode displayed superior failure detection performance, 
both in terms of accuracy and latency. 

On the other hand, Ephrath and Curry (1977) investigated the effects of
gust disturbances and the pilot's participation mode on failure detection 
performance during a simulated, low visibility landing approach. They 
reported that participation mode had a significant impact on human detection 
of subtle, slow failures in the lateral and pitch axes. A failure in the 
monitored axis was detected significantly faster than in the manually 
controlled axis. 

The incongruity of the above results reveals the necessity to specify 
under what conditions monitors are better failure detectors than 
controllers. Curry and Ephrath (1976), drawing upon the work of Young
(1969), have developed a model to predict whether monitors or active 
controllers will be better failure detectors under certain conditions. 
According to model predictions, monitors will provide faster detection 
latencies "if the control task requires considerable attention to steering 
displays, if there is slow adaptation on the part of the controller, or if 
there is a low signal to noise ratio in the control residual" (Curry & 
Ephrath, 1976, p. 143). This model is consistent with results reported by
previous investigators. This ability to predict whether system monitors or 
controllers will provide superior performance represents a significant step 
in understanding human failure detection abilities. 

Monitoring behavior in automatic systems. In the past, automatic 
systems were developed primarily with the intent of reducing operator
workload. By automating a task which was previously performed by a human
operator, system designers and engineers speculated that a considerable 
amount of time and effort could be released and channeled into other, more 
important areas of task performance. The operator in early automated 
systems still retained the ability to intervene manually in the case of 
system failures. Under these circumstances, considerations of active 
control versus passive monitoring still constituted relevant design 
alternatives. 

However, automation has recently moved toward extending the
capabilities of the human operator so that manual intervention would cease
to be a feasible option. Automated tasks are progressively exceeding the
abilities of human control. In the case of autopilot systems, automation
was initially introduced with the objective of easing pilot workload
(Johannsen, Pfendler & Stein, 1976). Nevertheless, subsequent applications
to supersonic and, eventually, hypersonic aircraft would involve operational
tasks which the human monitor is quite unprepared to assume in the event of
a failure. The introduction of automation, in this case, would not
necessarily ease the load upon the human operator. Rather, the use of
automatic systems would enhance the operational effectiveness of the
man/machine system by changing rather than reducing the human contribution
(Edwards, 1976). It is imperative that experimental research continue to
be aimed at investigating the workload demands of the human monitoring 
process. 

Quantitative models of human performance are particularly useful 
methods for describing and predicting human monitoring behavior. The 
construction of mathematical models partly depends on concise formulation of
hypotheses about the human operator. Curry and Gai (1976) have suggested
several hypotheses for the human monitor based on manual control theory
(Clement, McRuer, and Klein, 1971) which form the fundamental basis for many
modelling approaches: 

1. To accomplish system monitoring functions such as monitoring the 
state of the system and its various subsystems (including displays and 
failure detection systems), the operator uses a variety of models about 
the system and its performance based on his past experience. 

2. To be satisfactory, monitoring systems comprising both animate and 
inanimate components must share certain of the qualitative dynamic 
features of "good" failure detection systems of the solely inanimate
nature. As the adaptive means to accomplish this end, the observer
must make up for any deficiency of the information displayed by
appropriate adjustment of his dynamic information processing.

3. There is a cost to this adjustment--in workload induced stress, 
concentration of observer faculties, and in reduced potential for 
coping with the unexpected. This cost can also be traded for the cost 
of automatic monitoring systems. In making this trade-off, one may 
allocate part of the task to the human and part to the automatic 
failure detection system. (p. 148)

Based on the above hypotheses, Curry and Gai (1976) have described a 
model of human detection of changes in mean of a random process. Their 
particular model includes two stages: A linear estimator and a decision 
mechanism. The validity of this model was tested both in the laboratory and
in a more realistic setting involving automatic landing systems (Gai and 
Curry, 1976). In both situations, good agreement was found between 
predictions and experimental data. 

Attempts at modelling the human failure detection process have 
continually focused on normative predictions of optimal operator behavior 
(Smallwood, 1967; Sheridan, 1970; Kleinman & Curry, 1977). However, recent
research has indicated that there may be a significant discrepancy between 
predicted and actual sampling behavior of human monitors (Kvalseth, 1979) 
which may prompt alternative conceptualizations of visual processing as 
suboptimal behavior (Rouse, 1976). 

Internal models. At the very center of virtually every attempt to more
precisely model human behavior is the concept of the internal model. 
Veldhuyzen and Stassen (1976) observe that all forms of human behavior 
require some internal representation of the system being observed or 
controlled. The operator continually updates and compares his internal 
model to the actual system he is monitoring or controlling until the 
observed difference exceeds some subjective criterion and a "failure" is 
reported.

The internal model has proved to be of great utility in formulating 
quantitative theories of human monitoring performance. Smallwood's (1967) 
model provided a mathematical description of the updating of operator 
information by an internal model of the environment. Sheridan's (1976) 
generalized expected value approach to a model of supervisory control 
utilizes an internal model to predict the new process state resulting from 
any given action and initial process state. A utility function then 
specifies the worth of this change in state at the cost of that action. The 
optimal estimator of Curry and Gai (1976) is simply a Kalman filter based on 
the subject's internal model of the observed process. It is assumed that 
this filter reaches steady state after several observations and the human 
observer uses any error at the filter as an input to the decision mechanism. 
And finally, Rouse (1973) reported that subjects use mental models to 
predict future states from present observed states for discrete linear 
dynamic systems. 
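
The two-stage structure described above (an estimator driven by the internal model, followed by a decision mechanism operating on the estimation error) can be sketched as follows. This is a minimal illustration and not the Curry and Gai (1976) implementation: the scalar process model, the parameter values (a, q, r), the function names, and the one-sided cumulative-sum decision rule are all assumptions chosen for brevity.

```python
import math


def steady_state_gain(a, q, r, iterations=200):
    """Iterate the scalar Riccati recursion until the Kalman gain settles.

    Assumed internal model: x[k+1] = a*x[k] + w (variance q),
    observation y[k] = x[k] + v (variance r).
    """
    p = q
    for _ in range(iterations):
        p_pred = a * a * p + q        # predicted error variance
        k = p_pred / (p_pred + r)     # Kalman gain
        p = (1.0 - k) * p_pred        # updated error variance
    return k, p_pred + r              # gain and residual variance


def detect_mean_shift(observations, a=0.9, q=0.1, r=1.0,
                      drift=0.5, threshold=4.0):
    """Return the index at which a positive shift in mean is declared, or None."""
    k, residual_var = steady_state_gain(a, q, r)
    sigma = math.sqrt(residual_var)
    x_est = 0.0
    evidence = 0.0
    for i, y in enumerate(observations):
        x_pred = a * x_est                    # internal model prediction
        residual = y - x_pred                 # observed minus predicted
        x_est = x_pred + k * residual         # estimator update
        # Decision mechanism (assumed): accumulate normalized residuals
        # and report a failure once the evidence exceeds the criterion.
        evidence = max(0.0, evidence + residual / sigma - drift)
        if evidence > threshold:
            return i
    return None


if __name__ == "__main__":
    normal = [0.0] * 50                   # noise-free, no failure
    failed = [0.0] * 20 + [3.0] * 30      # mean shifts at index 20
    print(detect_mean_shift(normal), detect_mean_shift(failed))
```

The threshold plays the role of the subjective criterion: raising it delays detection but reduces false alarms when observation noise is present.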

However, predictions based on the internal model concept are not always 
accurate (Veldhuyzen and Stassen, 1976) since: 

1. The structure of the internal model may differ from the structure
of the system to be controlled or monitored.

2. The internal model parameters may differ from the parameters of the
system to be monitored or controlled.

3. System information can only be perceived with restricted accuracy. 

4. Disturbances are often not known exactly. (p. 110)

Given these nonlinear components of human behavior, the internal model 
construct is a useful but limited approach for predicting 
monitoring/controlling performance. 

Investigations should also be directed at describing the development of 
an individual operator's internal model based on task demands and 
requirements. Jagacinski and Miller (1978) suggest that Bayesian decision 
theory can be viewed as an attempt to formalize and externalize a decision 
maker's internal model. However, Tversky and Kahneman (1974) caution that 
people tend to use nonoptimal, stereotypic models of probabilistic processes 
in estimating the likelihood of events. 

Jagacinski and Miller's (1978) research effort utilized a behavioral 
approach which could provide evidence for the use of veridical or 
nonveridical models of controlled processes. By providing the operator with 
a simple task which allowed easy identification of operator failures to 
adequately characterize the response of the plant, this methodology 
permitted measurement of the internal model which could be communicated to 
the performer. Their results revealed the use of nonveridical models and 
indicated orderly changes in the internal model with practice. This 
technique, however, severely restricted the degrees of freedom in the human 
operator's response so that his ability to predict the time course of the 
dynamic system he is controlling could be more directly examined. Several 
critical assumptions involving methodological considerations may also limit 
the generality of these results. In addition, this derived conception of an 
individual operator's internal model does not define the specific
underlying behavioral processes involved.

The internal model concept has been applied to the control of large 
ships (Veldhuyzen and Stassen, 1977), as well as monitoring a dynamic system 
to detect failures (Wickens and Kessel, 1979a). The information demands
which result from the development and utilization of such internal models
will provide a number of theoretical and applied problems for workload 
assessment research. 

Workload Assessment and the Concept of Processing Resources

Workload assessment research represents a variety of concepts, models 
and methodologies which attempt to quantify the demand placed on an 
operator's limited processing abilities as he performs a particular task. 
System designers are especially interested in obtaining such a workload 
index to characterize the loading tendencies of a particular system. On the 
other hand, psychologists view workload assessment research as a means to 
explore the information processing abilities involved in many aspects of 
human behavior. The specific orientation adopted will strongly influence 
the concepts and techniques which will eventually be used. Wierwille and 
Williges (1978) provide an excellent survey of current workload 
methodologies utilized in both the theoretical and applied areas of 
aviation. 

Workload research has continually suffered from the inability of 
measuring techniques to adequately distinguish between mental and physical 
workload demands. The physiological and psychological state of the 
individual are often confounded in many workload indexes. This limitation, 
however, has not seriously obstructed the progress of engineering 
psychologists in conceptualizing, modeling and measuring the mental effort
which may be involved in such psychological constructs as internal models,
processing stages and attentional channels. Moray (1979) has compiled a 
series of papers whose common objective is to develop and clarify the 
theoretical and practical implications of mental workload. 

The prevailing notion of human performance as a compromise between the 
information processing capabilities of an operator and the obvious storage 
limitations of his memory has been supported in part by the theories and 
models of human attention (Keele, 1973). From Broadbent's filter model 
(Broadbent, 1957), to Treisman's attenuation model (Treisman, 1964), to 
Norman's late selection model (Norman, 1968), the concept of selective
attention has been developed in an effort to account for the processing 
limitations of human performance. Attention is itself a very broad area of 
psychological research, and workload occupies a particular niche in 
attempting to quantify the attentional demands inherent in a particular task 
situation. The costs associated with dual task performance, for example, 
have been described by attentional concepts and measured by workload 
assessment techniques in order to provide a sound theoretical basis for the 
hypothetical construct of "processing resources" (Wickens, 1979a). While
structural theorists (e.g. Keele, 1973; Kerr, 1975) prefer to view 
processing resources as related to the discrete competition of tasks for 
specific processing mechanisms, proponents of capacity theories (e.g. 
Kahneman, 1973; Moray, 1967) emphasize the flexible nature of processing 
resources which permits allocation in response to task demands. 

The structure specific resource model (Kantowitz and Knight, 1976;
Wickens, 1980) represents a compromise between the structural and capacity
views of human attention. The notion of a number of separate processing
reservoirs (as opposed to an undifferentiated pool) is consistent with many
results reported in the dual task literature. Wickens (1980) has drawn upon 
the results of many dual task studies in order to develop a useful framework 
for determining the functional composition of these attentional resource 
reservoirs. His efforts have led to several promising candidates for 
resource definition including: stages of processing (perceptual-central
processing-response), modalities of input (visual vs. auditory) and output
(vocal vs. manual) and hemispheres of processing (verbal vs. spatial) (see
Figure 1). This multiple reservoir view envisions task interference as a
function of processing pool overlap.

Task interaction has been utilized as a common technique for assessing 
workload demands. If a capacity model of processing resources is adopted, 
then workload may be conceptualized as the proportion of total resources 
demanded by a particular task. The higher the workload, the less "residual 
capacity" is left available for performing any concurrent task. The 
secondary task technique exploits this relationship by requiring the subject 
to perform two concurrent tasks with explicit instructions to maintain a 
consistently high level of performance on one of the tasks. In order to 
assess the workload demands of the emphasized, or primary task, a secondary, 
or loading task is imposed as a measure of the residual capacity. Secondary 
task performance under dual task requirements is then compared with 
performance of the secondary task alone. This performance difference is 
taken as an index of primary task workload (Ogden, Levine, & Eisner, 1979). 
Assessment of relative workload levels for two pieces of equipment may also 
be accomplished by examining the fluctuations in performance of the 
secondary task. Large decrements in secondary task performance are
associated with high workload levels.

Figure 1. The structure of processing resources.
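
The comparison used in the secondary task technique reduces to a simple difference score. The sketch below assumes reaction time (in ms) as the secondary task measure; the function name and all numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def secondary_task_decrement(single_task_rts, dual_task_rts):
    """Mean dual-task RT minus mean single-task RT (ms).

    A larger positive value is read as higher primary task workload,
    under the assumption that both tasks draw on the same resources.
    """
    return mean(dual_task_rts) - mean(single_task_rts)


# Hypothetical secondary task reaction times (ms), for illustration only.
alone = [420, 440, 430, 410]
with_system_a = [520, 540, 530, 510]
with_system_b = [460, 480, 470, 450]

decrement_a = secondary_task_decrement(alone, with_system_a)
decrement_b = secondary_task_decrement(alone, with_system_b)
```

With these made-up values, system A produces the larger decrement and would be judged the more demanding of the two pieces of equipment.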

The usefulness of the secondary task methodology, however, may be
limited when applied to a multiple resource view of human attention
(Brown, 1978). Under this model, it is possible to grossly underestimate
primary task workload if the secondary task probes the wrong resource pool 
(Kantowitz and Knight, 1976). A response loaded secondary task will not 
yield a useful index of primary task workload if the primary task loads the 
perceptual encoding reservoir. However, adopting a multidimensional view of 
workload can explain the failure of the secondary task to reflect variations 
in primary task workload as a mismatch between resource pools demanded by 
the two tasks. 

Adhering to the multidimensionality of workload measurement, Wickens 
and Kessel (1979b) investigated the demands of failure detection in dynamic 
systems according to a stages of processing approach. Failure detection 
performance in both the MA and AU modes was compared to the performance of 
each task alone and when it is performed concurrently with either a critical 
tracking task (Jex, 1967) or a mental arithmetic-memory loading task. The 
results of dual task performance indicated that the critical tracking 
loading task disrupted MA failure detection but not AU, while the converse 
results were obtained for the mental arithmetic-memory loading task. 
Interpreting these results within the framework of a structure specific 
resource model of human attention, the AU mode can be said to share 
processing resources with the mental arithmetic-memory task. These common 
resources reside primarily in the perceptual/central processing reservoir. 
On the other hand, the MA mode and the critical tracking task displayed 
similar processing reservoir overlap which was centralized in the response
related pool. Automation of the control function in the AU mode, therefore,
does not eliminate the demand for processing resources but rather shifts the 
demand to a functionally separate processing stage. 

Prior research in the area of workload and human monitoring behavior 
has suggested that minimal processing resource demands are involved in 
monitoring discrete stimuli (Posner & Boies, 1971; Keele, 1973), as well as 
continuous signals (Levison & Tanner, 1971). However, the research reported 
by Wickens and Kessel (1979b), as well as the research of other 
investigators (Senders, 1964; Isreal, Wickens, Chesney & Donchin, 1980), 
indicates that considerable processing demands are associated with some 
types of monitoring tasks. 

According to the preceding discussion, human monitoring behavior in a 
failure detection task demands processing resources which reside primarily 
in the perceptual/central processing reservoir. This view is based on a 
structure specific view of human attention which emphasizes a stages 
approach to human information processing. The concept of mental processing 
as a set of discrete and serial stages, each with a constant input and 
output, has been a useful and convenient framework for examining the 
structure of mental activity. Psychological research continues to be aimed 
at a more precise delineation of these mental stages and the corresponding 
processes which may account for certain accepted abilities of the human as 
an information processor. 

Sternberg's Additive Factors Method

The reaction time interval. The most common means for establishing the
existence and structure of mental events has been the reaction time 
interval. The reaction time interval is a widely used dependent measure 
which is relatively easy to obtain. Pachella (1974) argues that time itself
is directly meaningful and not arbitrarily related to some underlying 
construct. The events under study fill real time, and thus, real time is 
the variable of interest. 

Two types of converging operations have attempted to describe the 
activity which takes place during the reaction time interval: the 
subtraction method and the additive factors method. The subtraction method 
(Eriksen, Pollack & Montague, 1970) isolates a particular processing stage 
by constructing two qualitatively different tasks, one of which is believed
to contain all the mental activities of the first except for one stage of
interest. The reaction time difference for the two tasks indicates the 
amount of time the "subtracted" stage adds to total processing time. This 
method assumes that the experimenter has prior knowledge as to the 
sequencing of mental events and that deletion of a stage does not affect the 
activity of other mental stages. On the other hand, the additive factors 
method decomposes the reaction time interval through manipulation of 
experimental factors. These factors influence particular processing stages 
and produce converging patterns of reaction time data for their 
identification. 

Theoretical overview. The additive factors method is based on
Sternberg's investigations into the scanning of human memory. The data from
Sternberg's (1966) character comparison task provide evidence that human
scanning is both serial and exhaustive. The results indicate a strong
linear relationship between the number of items in memory and response
latency, suggesting the presence of a comparison process between test
stimulus onset and response execution. Each additional item in memory adds
approximately 38 ms to the response latency. The essentially equivalent
slopes for positive and negative responses also imply exhaustive search in
that every item in memory is scanned regardless of whether a match was made
previously.
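
The linear relationship between set size and response latency can be summarized by a slope (time per comparison) and an intercept (time for all other stages). The sketch below fits that line by ordinary least squares. The data are hypothetical, constructed to lie exactly on a line with the 38 ms slope quoted above; the 400 ms intercept and the set sizes are arbitrary assumptions.

```python
def fit_line(set_sizes, mean_rts):
    """Ordinary least-squares fit of RT = intercept + slope * set_size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, mean_rts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, noise-free mean reaction times (ms) per memory set size.
set_sizes = [1, 2, 4, 6]
mean_rts = [400 + 38 * s for s in set_sizes]

slope, intercept = fit_line(set_sizes, mean_rts)
```

Fitting positive and negative responses separately and comparing the two slopes would be the corresponding check on the exhaustive search interpretation.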

Sternberg's (1967) subsequent experiments on character recognition 
investigated the nature of test stimulus encoding. Two separate and 
independent operations seemed to be involved in the character recognition 
process: Stimulus encoding and stimulus comparison. The independence 
between these two mental operations demonstrates an instance of additivity 
of two effects on reaction time. The effects of set size (comparison
duration) and stimulus quality (encoding duration) on mean reaction time are 
independent of their respective levels. Such additivity supports the theory 
of a sequence of stages, one stage influenced by stimulus quality and the 
other by set size (Sternberg, 1969b). 

Sternberg's approach assumes that the reaction time interval is filled 
with a sequence of independent stages of processing. Total reaction time, 
then, is simply the sum of the individual stage durations. When an 
experimental manipulation (factor) affects reaction time for a particular 
information processing task, it changes the duration of one or more of the 
constituent stages of processing. If two experimental manipulations affect 
two different stages, they will produce additive effects on total reaction 
time (see Figure 2). However, if two experimental factors interact, so that 
the effect of one factor is dependent on the level of the other, they must 
affect some stage in common. 
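
For a 2 x 2 design, the additivity test amounts to checking whether the effect of one factor is the same at both levels of the other. A minimal sketch follows; the cell means are hypothetical, and in practice the interaction would be evaluated statistically (e.g. with an analysis of variance) rather than by inspecting a single contrast.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 table of mean RTs (ms).

    cell_means[i][j] is the mean RT at level i of factor A and level j
    of factor B. A value near zero is consistent with additive effects
    (separate stages); a clearly nonzero value indicates an interaction
    (at least one stage in common).
    """
    effect_of_b_at_a0 = cell_means[0][1] - cell_means[0][0]
    effect_of_b_at_a1 = cell_means[1][1] - cell_means[1][0]
    return effect_of_b_at_a1 - effect_of_b_at_a0


# Hypothetical cell means (ms), for illustration only.
additive = [[400, 450], [480, 530]]      # B adds 50 ms at both levels of A
interactive = [[400, 450], [480, 600]]   # B adds 50 ms, then 120 ms
```

Plotted as reaction time against one factor with a separate line for each level of the other, the additive case gives parallel lines and the interactive case gives diverging lines.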

Figure 2. Sternberg's additive factors method.

Sternberg (1969a) utilized his character comparison task embedded in a
series of multifactor experiments to investigate the effects of stimulus
quality, set size, response type and frequency of response type on reaction
time. The data revealed a converging pattern of evidence which suggested 
that four stages of information processing were involved in the task: an 
encoding stage, a comparison stage, a response choice stage, and a response 
execution stage. It is important to note that the additive factors method 
does not provide a description of the stages or the sequence in which they 
occur. These labels result from corroborating evidence from other sources 
which also support a particular stage description or sequence. 

The implication that these separate stages of processing draw from 
independent processing resources has been supported by dual task research.
Several experiments have demonstrated that tasks which are perceptually 
loaded can be successfully timeshared with tasks that are primarily response 
loaded (Wickens, 1976; Wickens and Kessel, 1979b), although the functional 
separation between perceptual and central processing resources may not be as 
clearly defined (Shulman & Greenberg, 1971). 

The additive factors logic has been utilized in a variety of 
experimental paradigms to further explore human information processing 
abilities. Sternberg's methodology has been employed in several dual task 
paradigms which have investigated the reaction time data associated with the 
study of the response decoding process (Briggs & Swanson, 1970), the 
localization of the divided attention effect (Briggs, Peters & Fisher, 
1972), and the processing automaticity involved in search tasks (Logan, 
1978), to name a few. These applications of the additive factors method are 
particularly useful within the context of workload assessment since the dual 
task data can provide an index of processing resource overlap between the 
manipulated reaction time task (inferred stage of processing) and the 
concurrent task (Wickens, 1980).

Workload applications. The additive factors method has displayed an
encouraging potential as a methodology for assessing primary task workload. 
Sternberg's character comparison task has been evaluated as a secondary 
measure of primary flying workload with promising results. Spicuzza, 
Pinkus, and O'Donnell (1974) utilized a fixed set procedure for both visual 
and auditory Sternberg stimuli coupled with a simulator flying task. In the 
visual condition, reaction time was plotted as a joint function of four 
levels of memory load and two levels of flight task difficulty. The effects 
of the set size manipulation of central processing load were additive with
flight difficulty, showing an increase in the intercept across conditions. 
However, since the experiment did not include an encoding or response 
manipulation, the specific source of this demand could not be localized. In 
the auditory version, this increase in intercept was accompanied by a 
decrease in slope indicating some degree of processing overlap at high 
memory load levels. The Sternberg procedures appeared to yield consistent 
and interpretable data with predominantly linear trends, although important 
modifications are necessary for incorporation into the secondary task 
paradigm. 

Crawford, Pearson and Hoffman (1978) have used the secondary Sternberg 
task as a measure of the reserve information processing capacity involved in 
two levels of flight control and four levels of multifunction switching. 
Slope and intercept variations were reported for the flight control 
conditions, while only intercept differences were observed for the 
multifunction keyboard conditions. These results suggest that flight 
control influences both input-output and central processing stages, while
anticipation of switching tasks affected the input-output stage only. These
conclusions demonstrate that workload demands of multifunction switching are 
important considerations as the development and implementation of digital 
avionics information systems becomes increasingly common. 
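
The slope and intercept logic used in these studies can be made concrete. In the sketch below, a line is fitted to reaction time against memory set size separately for the single and dual task conditions; a change in slope is attributed to the comparison (central processing) stage and a change in intercept to the remaining (input-output) stages. The data, function names, and dictionary keys are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def compare_conditions(set_sizes, single_rts, dual_rts):
    """Change in Sternberg slope and intercept from single to dual task."""
    single_slope, single_intercept = fit_line(set_sizes, single_rts)
    dual_slope, dual_intercept = fit_line(set_sizes, dual_rts)
    return {
        "slope_change": dual_slope - single_slope,
        "intercept_change": dual_intercept - single_intercept,
    }


# Hypothetical mean RTs (ms): the dual task adds a constant 60 ms.
set_sizes = [1, 2, 4, 6]
single = [400 + 38 * s for s in set_sizes]
dual = [460 + 38 * s for s in set_sizes]

result = compare_conditions(set_sizes, single, dual)
```

In this made-up case only the intercept changes, the pattern that was reported above for the multifunction keyboard conditions.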

The additive factors method has been employed as an effective 
instrument for probing the dimensions of workload. The particular framework 
utilized has been derived from the research of Sternberg and others showing 
that difficulty manipulations in a memory search reaction time task affect 
specific stages of processing (perceptual encoding, central processing, response). 
Wickens, Derrick, Beringer, and Micalizzi (1980) imposed different levels of 
loading at each of these stages and coupled the Sternberg manipulations with 
a primary tracking task. Dual task reaction times suggested that tracking 
order interacts with perceptual/central processing load, but is additive 
with response load. Conclusions from this investigation indicated that the 
Sternberg manipulations can selectively delineate the locus of perceptual 
and central processing load from response load. Increasing the order of the 
tracking task seems to demand processing resources from primarily the 
perceptual/central processing reservoirs according to a stages of processing 
approach. 

Similar variations of the Sternberg paradigm have coupled a monitoring 
task with the central processing manipulation to determine the locus of 
monitoring resource demands (Wickens & Micalizzi, 1980). Preliminary 
results are inconclusive in establishing the central processing reservoir as 
the source of monitoring processing load. The present investigation 
represents a follow-on to this monitoring study, requiring subjects to 
passively monitor a dynamic system and detect failures while performing a 
concurrent Sternberg secondary task. The quantitative demands of failure 
detection will be varied across subjects by manipulating the cutoff 
frequency of the random noise function. The Sternberg manipulations will 
load the perceptual and response processing stages. Prior research has 
suggested that this failure detection task demands primarily perceptual 
processing resources and, thus, should interact with the perceptual 
Sternberg manipulation. These results would indicate that Sternberg's 
additive factors method could provide an effective tool for exploring the 
multidimensionality of workload demands. 
METHOD 

Subjects 

Eight right handed male undergraduate students from the University of 
Illinois volunteered to participate in all experimental manipulations. All 
subjects had normal vision and were paid $2.50 per hour plus additional 
bonuses. The degree of right handedness was also evaluated for each subject 
to ensure that the right hand was clearly dominant (Bryden, 1977). 

Apparatus 

Subjects were seated in a booth containing a 10 cm x 8 cm Hewlett 
Packard 1330a cathode ray tube (CRT), a hand control joystick with an index 
finger trigger operated with the left hand, and a spring-return pushbutton 
keyboard operated with the index and middle fingers of the right hand. The 
viewing distance from the subject's eyes to the CRT was approximately 8b cm, 
subtending a visual angle of 5 degrees. A Raytheon 704 sixteen bit digital 
computer with 24k memory was used to generate and control a single axis 
pursuit tracking display, present the Sternberg stimuli, and process subject 
responses on both tasks. 

Tasks 

Failure detection. This task is similar to the automatic mode (AU) of 
failure detection reported in Wickens and Kessel (1979a). In the present 
study, subjects were required to monitor a single axis pursuit tracking 
display which moved horizontally across the CRT. The target path was driven 
by a summation of two sinusoidal inputs while the autopilot transfer 
function consisted of a pure gain and 200 ms time delay to specify cursor 
position on the basis of the error. A random noise disturbance was added to 
the output of the cursor. Thus the task might simulate a system following a 
semi-predictable path while compensating for disturbance gusts. System 
failures were simulated by a ten second linear ramp change in dynamics from 
a first order to a second order system. Subjects were instructed to press 
the joystick trigger with the left hand when they thought a failure had 
occurred. Four, five, or six failures occurred randomly during the two 
minute trial. A minimum of eight seconds had to elapse after a detection or 
miss before another failure could occur. As a manipulation of failure 
detection difficulty, the cutoff frequency of the random noise function was 
varied as an experimental factor (.32 Hz to .5 Hz) within subjects. The 
computer recorded hit latency and false alarms. 
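
The ten second linear ramp used to simulate a failure can be sketched as a blending weight that moves from 0 (first order dynamics) to 1 (second order dynamics). This is only an illustration: the function name `failure_ramp_weight`, the use of a single blending weight, and the clamping outside the ramp are assumptions and are not taken from the original simulation software.

```python
def failure_ramp_weight(t, onset, duration=10.0):
    """Return the fraction of second order dynamics in effect at time t (seconds).

    Assumption: the change in dynamics is modeled as one weight that rises
    linearly from 0 at failure onset to 1 after `duration` seconds.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    fraction = (t - onset) / duration
    # Clamp so the weight is 0 before the failure and 1 once the ramp is complete.
    return min(1.0, max(0.0, fraction))


# Hypothetical failure starting 30 s into a trial.
for t in (28.0, 30.0, 35.0, 40.0, 45.0):
    print(t, failure_ramp_weight(t, onset=30.0))
```

Halfway through the ramp the weight is 0.5, so under this assumption the simulated system would be an equal mix of first and second order behavior at that moment.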

Sternberg task. The general Sternberg paradigm required subjects to 
recognize previously presented spatial information. Specifically, a 
spatially defined target, consisting of a random dot pattern, appeared on 
the CRT for ten seconds prior to each failure detection trial. Each 
presented pattern originated from an alphabetized set of twenty four 
separate and distinct dot patterns adopted from Wickens and Sandry (1980). 
After ten seconds, the dot pattern was removed and a clear box appeared in 
the center of the screen. A series of test stimuli were then presented and 
the subject responded either "yes", a particular test stimulus was identical 
to the memorized stimulus, or "no", the test stimulus was different from the 
memorized stimulus. "Yes" and "no" responses were recorded by pressing the 
upper and lower keys with the right middle and index fingers, respectively. 
The computer recorded reaction time and errors. 

In the perceptually loaded condition, a grid of line segments was 
placed over the stimulus box in order to hinder the perceptual processing of 
the dot patterns. The mask had been pretested to ensure that no dot 
pattern's identity was obliterated. The mask only served to prolong the 
single task reaction times. 

In the response loaded condition, subjects were required to press two 
buttons in succession in order to record a specific response. For a "yes" 
response, the subject pressed the upper key followed by the lower key. The 
second key was to be depressed within a time window of .3 seconds to .6 
seconds following the first. The desired result was a smooth, coordinated 
response which produced slightly higher single task reaction times than 
simply a single key response. Similarly, a "no" response was recorded by 
first pressing the lower key and then the upper key within the .3 second 
window. Nonresponses were recorded by the computer when the subject was 
either too fast (<.3 seconds) or too slow (>.6 seconds) in pressing the 
second key. The reaction time interval began when the first key was 
depressed. 
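
The double key scoring rule described above can be sketched as follows. The function and constant names are hypothetical, and treating an unexpected key sequence (for example, the same key twice) as a nonresponse is an assumption; the text specifies only the two valid sequences and the timing window.

```python
MIN_INTERVAL_S = 0.3  # a second key press sooner than this is "too fast"
MAX_INTERVAL_S = 0.6  # a second key press later than this is "too slow"


def classify_double_key(first_key, second_key, interval_s):
    """Classify a two-key response as 'yes', 'no', or 'nonresponse'.

    first_key, second_key: 'upper' or 'lower'.
    interval_s: time in seconds between the first and second key press.
    """
    if not (MIN_INTERVAL_S <= interval_s <= MAX_INTERVAL_S):
        return "nonresponse"
    if (first_key, second_key) == ("upper", "lower"):
        return "yes"
    if (first_key, second_key) == ("lower", "upper"):
        return "no"
    # Assumption: any other key sequence is also scored as a nonresponse.
    return "nonresponse"


print(classify_double_key("upper", "lower", 0.45))
print(classify_double_key("lower", "upper", 0.20))
```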

Experimental Design 

A within subject design was employed where each subject participated in 
all experimental manipulations. The Sternberg conditions included a 
baseline condition, a perceptually loaded condition, and a response loaded 
condition. The failure detection difficulty manipulation varied the cutoff 
frequency of the random noise function from .32 Hz to .5 Hz. Each of these 
task manipulations was performed under both single and dual task conditions. 
All subjects participated in six sessions consisting of two days of practice 
and four days of data collection. Each session lasted one hour and took 
place on consecutive days. The cutoff frequency levels were administered on 
different days and the particular order was counterbalanced for each subject 
to avoid the bias of any particular sequence. 

Procedure 

The practice days were divided into single task and dual task training 
sessions. All subjects received enough training in the experimental 
conditions to insure relatively stable performance. 

The four experimental days each consisted of fourteen total trials. 
The failure detection difficulty level remained constant throughout a 
particular experimental session. During each session, subjects were 
required to perform two single task failure detection trials, six single 
task Sternberg trials, and six dual task trials. These Sternberg trials 
were administered in four alternating blocks of three single task trials 
followed by three dual task trials. The three Sternberg manipulations 
consisted of a no mask-single key response condition (baseline), a 
mask-single key response condition (perceptual loading), and a no 
mask-double key response condition (response loading). Two replications of 
each Sternberg manipulation were presented to the subject for both single 
and dual task conditions. Each trial lasted approximately two minutes and 
between trials, the subject was given feedback concerning task performance 
and bonus earned. 

Experimenter instructions designated the failure detection task as 
primary so that subject performance on this task in both single and dual 
task conditions should be essentially equivalent. Therefore, secondary 
Sternberg performance should reflect changes in the processing demands of 
the primary task. 

The bonus system reinforced these instructions. The failure detection 
bonus depended on hit latency and was halved if one false alarm was 
generated. Two false alarms resulted in elimination of the failure 
detection bonus altogether. The Sternberg bonus was contingent on 
acceptable primary task performance (dual task = single task) and based on 
reducing reaction time below the previous day's single task reaction time 
score. Excessive errors also reduced the bonus which could be earned. 
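
The false alarm penalty on the failure detection bonus can be sketched as below. The mapping from hit latency to a monetary amount is not given in the text, so `latency_bonus` is a hypothetical input, and treating more than two false alarms the same as two is an assumption.

```python
def failure_detection_bonus(latency_bonus, false_alarms):
    """Apply the false alarm penalty to a latency-based bonus.

    latency_bonus: bonus earned from hit latency alone (hypothetical input).
    false_alarms: number of false alarms generated on the trial.
    """
    if false_alarms < 0:
        raise ValueError("false_alarms cannot be negative")
    if false_alarms == 0:
        return latency_bonus
    if false_alarms == 1:
        return latency_bonus / 2  # one false alarm halves the bonus
    # Two false alarms eliminate the bonus; assumption: so do more than two.
    return 0.0


for n in range(3):
    print(n, failure_detection_bonus(1.00, n))
```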
RESULTS/DISCUSSION 

The summary data for both the failure detection and Sternberg tasks are 
presented in Table 1. Single and dual task Sternberg performance for both 
the perceptual and response load conditions are graphically portrayed in 
Figure 3. Failure detection performance in the .32 Hz and .5 Hz conditions 
is shown in the top and bottom panels respectively. The experimental 
results indicate that the interaction between perceptual load and the 
presence of the failure detection task was statistically significant, 
F(2,14) = 8.10, p < .01. Under dual task conditions, a significantly greater 
increase in Sternberg reaction times was obtained for the mask manipulation 
compared to the no mask condition. As suggested by the data in Figure 3, no 
significant positive interaction was found between response load and failure 
detection. However, an instance of underadditivity was found for the .32 Hz 
condition, F(2,14) = 9.81, p < .01. The double key response reaction times 
were not as severely disrupted under dual task demands as the perceptual 
load reaction times. 
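
The additive factors logic behind these comparisons can be illustrated with a simple interaction contrast on the Table 1 means for the .5 Hz condition. This is a descriptive sketch only, not the analysis of variance reported here; the function name `interaction_contrast` is hypothetical.

```python
def interaction_contrast(single_base, single_load, dual_base, dual_load):
    """Load effect under dual task minus load effect under single task.

    Positive: overadditive (the load costs more under dual task demands).
    Near zero: additive. Negative: underadditive.
    """
    return (dual_load - dual_base) - (single_load - single_base)


# Mean Sternberg reaction times (seconds), .5 Hz condition, from Table 1.
single = {"baseline": 0.604, "mask": 0.676, "double_key": 0.674}
dual = {"baseline": 0.780, "mask": 0.906, "double_key": 0.821}

mask_contrast = interaction_contrast(
    single["baseline"], single["mask"], dual["baseline"], dual["mask"]
)
double_key_contrast = interaction_contrast(
    single["baseline"], single["double_key"], dual["baseline"], dual["double_key"]
)

print(round(mask_contrast, 3), round(double_key_contrast, 3))
```

For these means the mask contrast is positive (about .054 seconds) and the double key contrast is negative (about -.029 seconds), which matches the direction of the pattern described in the text.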

The ability of subjects to maintain consistent primary task performance 
for both single and dual task conditions is an important requirement for any 
interpretation of the dual task data. A comparison of the single and dual 
task failure detection data (see Table 1) revealed essentially equivalent 
performance for these two conditions, F(3,21) = 2.17, p > .122. In most 
cases, subjects were able to maintain superior performance under dual task 
demands. Thus, we can be relatively secure in the knowledge that similar 
amounts of processing resources for the failure detection task were used in 
single as well as dual task conditions. This assumption permits an 
interpretation of Sternberg reaction time decrements as an indication of 
task manipulations. 

Table 1 

Mean Task Latencies (seconds) 
for Each Processing Load Condition 

                                   Load 
Condition              Baseline      Mask      Double Key 

Failure Detection 
  Single Task 
    .32 Hz              5.594 
    .50 Hz              5.201 
  Dual Task 
    .32 Hz              4.943        5.127       5.220 
    .50 Hz              5.119        5.094       5.064 

Sternberg 
  Single Task 
    .32 Hz               .617         .687        .667 
    .50 Hz               .604         .676        .674 
  Dual Task 
    .32 Hz           [illegible]      .914        .794 
    .50 Hz               .780         .906        .821 

Although maintaining single task failure detection performance under 
dual task demands is an important requirement for any interpretation of the 
reaction time data, an equally important consideration is the ability of 
subjects to avoid utilizing a "resource tradeoff" strategy in producing the 
observed reaction time decrements. Large variations in dual task failure 
detection performance across Sternberg conditions may reflect this strategy 
and could potentially account for the particular pattern of Sternberg data 
shown in Figure 3. If the higher reaction times in the perceptual load 
condition are consistently linked with relatively lower failure detection 
latencies (compared with the response load condition) then a resource 
tradeoff strategy may have been utilized. Under this interpretation, 
processing resources are assumed to be diverted (traded off) from the 
Sternberg task (resulting in higher reaction times) and applied to the 
failure detection task (resulting in lower hit latencies). As a result, 
variations in reaction time performance across Sternberg conditions could be 
explained in terms of subject strategy without reference to competition 
among hypothesized pools of processing resources. 

The presence or absence of such a tradeoff can be illustrated through 
the use of a performance operating characteristic (POC) (see Figure 4). The 
efficiency level of the two tasks performed concurrently can be represented 
within the POC space. Single task performance is indicated by the point of 
intersection of the POC with the two axes. Dual task performance is 
identified as a single point within the space representing the decrement 
score on both tasks relative to their respective single task performance 
levels. Shifts along the positive diagonal toward the southwest direction 
represent improvements in time sharing efficiency. Shifts along the 
negative diagonal represent variations in resource allocation policy. 

Figure 4. Hypothetical representation of 
dual task performance within the POC space. 

In order to compare tasks which utilize different dependent variables, 
the performance measure of each task is converted to a common dimensionless 
unit such as a normal deviate (Wickens, Mountford, & Schreiner, 1979). In 
the present study, dual task difference scores for both the reaction time 
and hit latency measures were divided by the respective mean standard 
deviations and plotted within the POC space for the .32 Hz and .5 Hz 
manipulations (see Figures 5 & 6). A comparison of these dual task 
difference scores along a common measuring scale reveals a clear separation 
of respective POCs for the perceptual and response load conditions. The 
perceptual load condition disrupted dual task efficiency to a much greater 
extent than did the response load condition. 
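
The conversion to a common dimensionless unit can be sketched as a difference score divided by a standard deviation. The means below are taken from Table 1 (.5 Hz, mask condition); the standard deviations are hypothetical placeholders, since the actual values are not reported in this section, and the function name is likewise an assumption.

```python
def normalized_decrement(dual_mean, single_mean, mean_sd):
    """Dual task difference score expressed in standard deviation units.

    For latency measures, a positive value means slower performance under
    dual task conditions than under single task conditions.
    """
    if mean_sd <= 0:
        raise ValueError("mean_sd must be positive")
    return (dual_mean - single_mean) / mean_sd


# Sternberg reaction time (seconds); the SD of 0.10 is a hypothetical value.
rt_decrement = normalized_decrement(0.906, 0.676, mean_sd=0.10)

# Failure detection hit latency (seconds); the SD of 1.0 is a hypothetical value.
latency_decrement = normalized_decrement(5.094, 5.201, mean_sd=1.0)

print(round(rt_decrement, 3), round(latency_decrement, 3))
```

Each dual task condition then becomes a single point in the POC space, with one normalized decrement on each axis.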

The results of the analysis of variance support the general impression 
conveyed by the respective POCs. A comparison of mean dual task failure 
detection hit latencies across Sternberg conditions reveals no significant 
variations, F(2,14) = .234, p > .794. The relatively higher perceptual load 
reaction times observed under dual task conditions were not necessarily 
accompanied by correspondingly lower hit latencies. The variation of dual 
task failure detection performance is not large enough to account for the 
larger decrements in reaction time performance. 

It is also important to ensure that the reaction time differences 
between the perceptual and response load conditions did not result from a 
speed/accuracy tradeoff. Table 2 contains a summary of the error data for 
both the failure detection and Sternberg tasks. The results indicate that 
Sternberg errors were significantly greater in the response load condition, 
F(2,14) = 6.22, p < .05. However, this error variation could be due to the 
increased opportunity for error in the double key response condition 
(recognition errors and double key response errors). In addition, there was 
no interaction between Sternberg condition and single/dual task demands for 
reaction time errors. In other words, although the relative percentage of 
errors varied across Sternberg conditions, this variation was consistent for 
both single and dual task conditions. A speed/accuracy tradeoff explanation 
of the results could not be applied to those interpretations of the reaction 
time data which are concerned with performance variations as a function of 
Sternberg condition and single/dual task demands. 

[Figure 5 plot. Axes: normalized decrement scores; failure detection 
normalized decrement scores (cutoff frequency = .32 Hz). Legend: perceptual 
load, response load, baseline.] 

Figure 5. POC representation of dual task performance 
for the .32 Hz cutoff frequency manipulation. 

Figure 6. POC representation of dual task performance 
for the .5 Hz cutoff frequency manipulation. 

Table 2 

Error Data for the Failure 
Detection and Sternberg Tasks 

                                   Load 
Condition              Baseline      Mask      Double Key 

False Alarms 
  Single Task 
    .32 Hz               .063 
    .50 Hz               .219 
  Dual Task 
    .32 Hz               .219         .375        .313 
    .50 Hz               .656         .219        .281 

Percentage of Sternberg Errors 
  Single Task            2.33         3.68        5.43 
  Dual Task 
    .32 Hz               2.89         4.33        6.48 
    .50 Hz               3.20     3.[illegible]   6.50 

The effect of the various experimental conditions on the number of 
false alarms appeared to be generally nonsignificant, although under dual 
task demands, there was a significant difference between the cutoff 
frequency manipulations for the baseline (no mask-single key) Sternberg 
condition, F(2,14) = 6.86, p < .01. However, comparisons with corresponding 
hit latencies do not indicate that this difference was in the direction of 
a speed/accuracy tradeoff. 

These experimental results provide at least some support for the main 
hypotheses advanced in the beginning of this paper. First, the significant 
interaction between perceptual load and failure detection demands indicates 
some degree of processing resource overlap between these two tasks within 
the framework of the additive factors method. Second, the lack of a 
significant interaction between response load and failure detection demands 
provides evidence for the notion of a separation of the respective 
processing resource pools. The underadditivity observed in the .32 Hz 
condition may be attributed to an increased mobilization of response related 
processing capacity at higher levels of workload which could account for the 
reduced slope in the dual task condition. 

The failure detection task used in this study appears to be primarily 
perceptually loaded. This conclusion is consistent with previous studies 
(Wickens & Kessel, 1979b) which investigated the resource demands of failure 
detection with a different secondary task. In addition, these results also 
support a stages of processing dimension for the structure specific resource 
model which is particularly applicable to workload investigations. The 
general Sternberg paradigm utilized in this study has shown promise as a 
technique for probing the multidimensionality of workload demands within the 
context of dual task methodology. 

Comparisons of the dual task data obtained for the .32 Hz and .5 Hz 
cutoff frequency manipulations must be accompanied by cautious 
interpretation of the experimental results. One aspect of the data which is 
not immediately interpretable within a resource competition model concerns 
the difference between single and dual task failure detection hit latency 
for the two cutoff frequency manipulations. The average single task failure 
detection hit latency for the .32 Hz manipulation was considerably higher 
than either its dual task value or the single or dual task latencies for the 
.5 Hz condition (see Figure 7). This result suggests that dual task 
requirements actually improved failure detection performance in the .32 Hz 
condition. This might be explained in terms of relative arousal levels. The 
slower dynamics of the .32 Hz system may have induced a lower level of 
arousal which contributed to the consistently higher single task hit 
latencies in this condition. However, under dual task conditions, the level 
of arousal may have increased in the .32 Hz condition to a level more 
comparable to the .5 Hz condition, and the performance in each condition was 
more nearly equivalent. Interpretations of the cutoff frequency manipulation 
as a manipulation of task difficulty are not clearly supported by the data 
even though, a priori, the increased velocity component in the .5 Hz 
condition would seem to render this task subjectively more difficult. 

Figure 7. Single & dual task failure detection hit latencies 
for the .32 Hz & .5 Hz cutoff frequency manipulations. 

The average dual task Sternberg data did not vary significantly between 
the two cutoff frequency manipulations with the exception of the response 
load condition. The lower response load reaction time in the .32 Hz 
manipulation was primarily responsible for the significant interaction 
between the Sternberg conditions and the cutoff frequency manipulations for 
dual task reaction time, F(2,14) = 4.20, p < .05. It is not clear whether 
this difference in response load reaction time is an independent effect or 
whether it can be explained in terms of relative arousal levels. 

Perhaps the most important contribution of this study has been to 
provide evidence for the utility of the general Sternberg paradigm in 
assessing the locus of processing resource demands for a particular primary 
task. This procedure is especially appropriate for probing the 
multidimensional aspects of the generalized workload concept. Workload 
assessment continues to be an important activity in the human factors 
evaluation of complex system interactions. Although criticisms of the 
additive factors method (Pachella, 1974) and alternate conceptions of the 
structure of the reaction time interval (McClellan, 1978) may weaken the 
theoretical basis for the Sternberg methodology, this method may still 
provide some degree of practical application in localizing the workload 
effects involved in man/machine systems. 